Document Level Subjectivity Classification Experiments in DEFT’09 Challenge

Authors

  • Cigdem Toprak
  • Iryna Gurevych
Abstract

This article presents our supervised classification experiments on document-level subjectivity, for English and French, carried out in the DEFT’09 text mining challenge. We tested word-based, part-of-speech, and specialized-lexicon features as input to an SVM classifier. Our experiments on word features examine the usefulness of contextual information and compare binary and tf*idf representations on this task. We show that different class distributions favor different feature representations. For English, we then compare three lexicons specialized in the expression of opinions, two of which are well known: the subjectivity clues of (Wiebe and Riloff, 2005; Wilson et al., 2005), SentiWordNet (Esuli and Sebastiani, 2006), and a list of verbs compiled from (Santini, 2007; Biber et al., 1999). Despite its low coverage, this lexicon of 156 verbs gives fairly good results for English.
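To make the comparison between binary and tf*idf word representations concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the whitespace tokenizer, the toy documents, the helper names, and the weighting variant tf * log(N / df) are all assumptions chosen for illustration, since the abstract does not say which tf*idf formula or preprocessing was used. The resulting vectors are the kind of input an SVM classifier would be trained on.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (assumption: the real preprocessing is not given)."""
    return text.lower().split()


def build_vocabulary(docs):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def document_frequencies(docs):
    """Count, for each token, the number of documents it occurs in."""
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return df


def binary_vector(doc, vocab):
    """1.0 if the token occurs in the document, 0.0 otherwise."""
    vec = [0.0] * len(vocab)
    for tok in set(tokenize(doc)):
        if tok in vocab:
            vec[vocab[tok]] = 1.0
    return vec


def tfidf_vector(doc, vocab, df, n_docs):
    """Raw term frequency times log(N / df) (one common variant, assumed here)."""
    vec = [0.0] * len(vocab)
    for tok, tf in Counter(tokenize(doc)).items():
        if tok in vocab:
            vec[vocab[tok]] = tf * math.log(n_docs / df[tok])
    return vec


# Hypothetical toy corpus, for illustration only.
docs = [
    "i think this film is great great fun",
    "the film was released in 2009",
    "i believe the plot is weak",
]
vocab = build_vocabulary(docs)
df = document_frequencies(docs)
binary = [binary_vector(d, vocab) for d in docs]
tfidf = [tfidf_vector(d, vocab, df, len(docs)) for d in docs]
```

In the first toy document, the repeated word "great" receives the same binary value as any other word that is present, while its tf*idf weight grows with both its repetition and its rarity across documents. That is the difference in information the two representations give the classifier.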

Similar resources

Genre classification using Balanced Winnow in the DEFT 2014 challenge

In this report we present the work done on the first subtask of the DEFT 2014 challenge, which dealt with genre classification of French literary texts. In our approach we developed three types of features: lemmatized words, stylometric features, and features that incorporate some form of world knowledge. Subsequent classification experiments were performed using the Balanced Winnow classifier. ...

MHSubLex: Using Metaheuristic Methods for Subjectivity Classification of Microblogs

In Web 2.0, people are free to share their experiences, views, and opinions. One of the problems that arises in Web 2.0 is the sentiment analysis of texts produced by users in outlets such as Twitter. One of the main tasks of sentiment analysis is subjectivity classification. Our aim is to classify the subjectivity of Tweets. To this end, we create subjectivity lexicons in which the words into ...

Systèmes du LIA à DEFT'13

The Systems of LIA at DEFT’13. The 2013 Défi de Fouille de Textes (DEFT) campaign addresses two types of language analysis tasks: document classification and information extraction in the specialized domain of cooking recipes. We present the systems that the LIA used in DEFT 2013. Our systems show interesting results despite the complexity of the proposed tasks. MOTS-CLÉS...

Is a Voting Approach Accurate for Opinion Mining?

In this paper, we focus on classifying documents according to the opinions and value judgments they contain. The main originality of our approach is to combine linguistic pre-processing, classification, and a voting system using several classification methods. In this context, the relevant representation of the documents makes it possible to determine the features for storing textual data in data warehouses. The...

Multimodal Subjectivity Analysis of Multiparty Conversation

We investigate the combination of several sources of information for the purpose of subjectivity recognition and polarity classification in meetings. We focus on features from two modalities, transcribed words and acoustics, and we compare the performance of three different textual representations: words, characters, and phonemes. Our experiments show that character-level features outperform wo...


Publication date: 2009